ABSTRACT

This paper compares two machine learning (ML) algorithms, k-nearest neighbor (KNN) and
naive Bayes (NB), for identifying and diagnosing harmonic sources in a power system. The proposed
method uses a single-point measurement, and the measured signals are analyzed with the S-transform
to extract voltage and current parameters. The voltage and current features estimated from
the time-frequency representation (TFR) of the S-transform are used as the inputs to the ML algorithms.
Four significant cases of harmonic source location are considered, with harmonic voltage (Hv) and
harmonic current (Hc) source type-loads used in the diagnosis. To identify the better ML algorithm,
the accuracy, precision, specificity, sensitivity, and F-measure of each are calculated.
The adequacy of the proposed methodology is tested and verified on the IEEE 4-bus test feeder,
and each ML algorithm is executed 10 times to guard against overfitted results.

1. INTRODUCTION

Power quality issues such as harmonics have become a cross-disciplinary area of study that combines
power electronics, power engineering, digital signal processing, artificial intelligence and embedded
systems [1]-[3]. Harmonics have become an important power quality problem because they increase
the distortion level in the power system [4], [5]. Harmonic pollution at the point of common coupling
(PCC) results from multiple harmonic sources, including non-linear loads connected to the power network,
and the injected harmonic components may cause hardware failure and malfunction of sensitive loads
[6], [7]. Therefore, the diagnosis, identification and monitoring of harmonic sources have become a main
concern in power systems [8]. Once a harmonic source is identified, its effects on the power system can be
studied and proper mitigation methods can be implemented [9]. Harmonic source identification is thus
important, and numerous techniques have been proposed for it, each based on different hypotheses and
offering different advantages [10], [11]. In [12], the random probability distribution of data is combined
with fast Fourier transform (FFT) analysis to identify the type of harmonic source.
However, the FFT cannot accommodate non-stationary signals, whose spectral characteristics change
over time [13], [14]. Knowledge-based systems, namely fuzzy logic, neural networks and machine learning,
are computer programs that imitate the decision-making abilities of human experts within a specified
field [2], [15]. Fuzzy logic (FL) approaches that use If-Then rules for identification and diagnosis are
proposed in [16]-[18]. To validate the rule base, all possible rules are examined [19], [20]. Nevertheless,
FL has several restrictions: the enormous number of rules required in the knowledge base makes
the framework unwieldy and complicates its maintenance, particularly for minor updates, and it is difficult
to assign a certainty rating to each rule [21]-[23]. Applying artificial neural networks (ANN) to
harmonic load diagnosis raises some practical issues, which are discussed in [24], [25].
An ANN can learn the mathematical relationship between a series of input variables, which are
the independent or predictor variables, and the output variables [26], [27]. However, the ANN model suffers
from overfitting to the training data set and poor performance on external test data sets [28], [29].
Currently, machine learning (ML), a subset of artificial intelligence, has become one of the crucial
identification methods [30]. The literature reports that the most widely used ML algorithms with
satisfactory performance in classification and diagnosis are k-nearest neighbor (KNN), naive Bayes (NB),
support vector machine (SVM) and linear discriminant analysis (LDA) [30]. Although many methods have
been proposed for harmonic source identification, they are still not accurate and fast enough to identify
the harmonic sources connected to the power system network. A good digital signal processing (DSP)
technique is required to process the input signals used in the proposed method [14], [31]. The S-transform,
a hybrid of the short-time Fourier transform (STFT) and the wavelet transform (WT), is the most suitable
DSP technique because it offers high resolution in both time and frequency [32], [33].

This paper proposes an accurate and fast method to diagnose the type of harmonic sources
in the distribution system from a single-point measurement at the PCC by utilizing machine learning
techniques [34], [35]. The diagnosis uses two popular machine learning algorithms, namely KNN and
NB [36], [37]. KNN is among the simplest classifiers to use [38]. Unlike other algorithms, KNN predicts
the test data directly from distance measurements on the training data, which is computationally less
expensive. In this work, KNN with Euclidean distance and k=1 is applied. KNN usually produces results
quickly, and it is not only simple but also computationally efficient. NB, another reliable classifier,
is also implemented in this work [39], [40]. Because NB predicts classes using Bayes' theorem,
it can be used to identify multiple harmonic sources in the current research. In the present study, NB with
a normal distribution is adopted [41], [42]. The effectiveness and power of machine learning motivated us
to apply it to harmonic source identification. The better machine learning algorithm is then selected
based on performance measurement criteria, namely accuracy, precision, specificity, sensitivity and
F-measure [43]. In addition, the S-transform is used to process the input signals measured
at the PCC of the power system network [44]-[46].


2. RESEARCH METHOD 

This section describes the machine learning algorithms, also known as classifiers, used in this work.
The aim is to diagnose the type of harmonic source using power quality features extracted from both
the current and voltage signals. Hence, two simple machine learning algorithms, namely KNN and NB,
are employed. KNN is among the simplest classifiers to use. Unlike other algorithms, KNN predicts
the test data directly from distance measurements on the training data, which is computationally less
expensive. In this work, KNN with Euclidean distance and k=1 is applied. NB, another reliable classifier,
is also implemented. Because NB predicts classes using Bayes' theorem, it can be used to identify
multiple harmonic sources in the current research [47]. In the present study, NB with a normal
distribution is adopted.
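
The two classifiers described above can be sketched as follows. This is a minimal illustration written with NumPy, not the implementation used in this work: the names `knn1_predict` and `GaussianNB`, the small variance floor, and the toy data are all assumptions introduced for the example.

```python
import numpy as np

def knn1_predict(X_train, y_train, X_test):
    """1-nearest-neighbour prediction (k=1) with Euclidean distance."""
    X_train, X_test = np.asarray(X_train, float), np.asarray(X_test, float)
    y_train = np.asarray(y_train)
    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Each test sample takes the label of its closest training sample.
    return y_train[np.argmin(d2, axis=1)]

class GaussianNB:
    """Naive Bayes with one normal distribution per feature and class."""

    def fit(self, X, y, var_floor=1e-9):
        X, y = np.asarray(X, float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # var_floor is an assumed safeguard against zero variance.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + var_floor
        self.log_prior_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        return self

    def predict(self, X):
        X = np.asarray(X, float)
        diff = X[:, None, :] - self.mean_[None, :, :]          # (n, classes, features)
        # Sum of per-feature Gaussian log-densities for every class.
        log_lik = -0.5 * (np.log(2 * np.pi * self.var_)[None] + diff**2 / self.var_[None]).sum(axis=2)
        return self.classes_[np.argmax(log_lik + self.log_prior_, axis=1)]

# Hypothetical toy data: two well-separated classes with two features each.
X_train = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 0.8]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.05, 0.0], [0.95, 0.9]])

knn_pred = knn1_predict(X_train, y_train, X_test)
nb_pred = GaussianNB().fit(X_train, y_train).predict(X_test)
```

KNN stores the training set and defers all computation to prediction time, whereas NB summarizes each class by per-feature means and variances, so the two differ mainly in where the computational cost falls.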

As recommended in the current literature, the proposed technique is realized using a measurement
at the PCC, as shown in Figures 1 and 2, on the IEEE 4-bus test feeders. The measurement signals
are analyzed using the S-transform technique [48], and two types of harmonic source are considered
in this research: the harmonic current source (Hc) and the harmonic voltage source (Hv)
type-load [49]. Four specific cases are considered [50], [51]:

- Case 1: no harmonic source in the power system (N-N),
- Case 2: harmonic source located downstream of the PCC (N-H),
- Case 3: harmonic sources located both downstream and upstream of the PCC (H-H),
- Case 4: harmonic source located upstream of the PCC (H-N),

where N is a non-harmonic source, which is a resistor load, and H is a harmonic-producing load, which is
either Hc or Hv. The main goal of this research is to identify and diagnose the type of harmonic source
in the power system. Figure 3 shows an overview of the proposed method. First, the current and
voltage signals are measured at the PCC. The S-transform is then applied to transform the voltage and
current signals into a time-frequency representation (TFR). Next, the signal parameters are estimated
from the TFR and divided into two feature sets: (i) the current feature set and (ii) the voltage feature
set. The feature sets are normalized and fed into the machine learning algorithms for the diagnosis of
harmonic sources. KNN and NB are then applied to diagnose the N-N, N-H, H-H and H-N cases
for both Hc and Hv.
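
The step that transforms a measured signal into a TFR can be sketched as below. This is a minimal illustration of the standard discrete S-transform computed through the FFT, and it is not taken from the implementation used in this work; the function name `s_transform` and the synthetic test signal are assumptions introduced for the example.

```python
import numpy as np

def s_transform(x):
    """Discrete S-transform of a real 1-D signal (illustrative sketch).

    Returns a complex array of shape (N // 2 + 1, N): rows are frequency
    indices 0..N//2 and columns are time samples.
    """
    x = np.asarray(x, dtype=float)
    N = x.size
    X = np.fft.fft(x)
    XX = np.concatenate([X, X])            # allows reading the spectrum shifted by n
    m = np.fft.fftfreq(N, d=1.0 / N)       # signed frequency indices 0, 1, ..., -1
    S = np.empty((N // 2 + 1, N), dtype=complex)
    S[0, :] = x.mean()                     # the zero-frequency row is the signal mean
    for n in range(1, N // 2 + 1):
        # Frequency-domain Gaussian window whose width scales with n.
        gauss = np.exp(-2.0 * np.pi**2 * m**2 / n**2)
        S[n, :] = np.fft.ifft(XX[n:n + N] * gauss)
    return S

# Hypothetical test signal: amplitude 2, five cycles over 128 samples.
N, k = 128, 5
t = np.arange(N)
signal = 2.0 * np.cos(2 * np.pi * k * t / N)

tfr = np.abs(s_transform(signal))              # magnitude of the TFR
dominant = int(np.argmax(tfr.mean(axis=1)))    # frequency row with the most energy
```

For this stationary test signal the row with the largest mean magnitude coincides with the signal frequency index. With measured PCC signals, the rows at the fundamental and harmonic frequencies would be the ones from which signal parameters are estimated.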









Figure 1. Upstream-Downstream for Case 1

Figure 2. IEEE 4-bus test feeders for Case 2, 3 and 4

Figure 3. An overview of proposed method


2.1. Voltage and current feature sets 

In this research, the voltage and current feature sets each contain five signal parameters estimated
from the voltage and current signals at the PCC, respectively [52], [53]:
a. The average instantaneous RMS of voltage and current ($V_{rms,ave}$ and $I_{rms,ave}$)
b. The average instantaneous RMS fundamental of voltage and current ($V_{1rms,ave}$ and $I_{1rms,ave}$)
c. The average instantaneous total harmonic distortion of voltage and current ($THD_{V,ave}$ and $THD_{I,ave}$)
d. The average instantaneous total nonharmonic distortion of voltage and current ($TnHD_{V,ave}$ and $TnHD_{I,ave}$)
e. The average instantaneous total waveform distortion of voltage and current ($TWD_{V,ave}$ and $TWD_{I,ave}$)

2.2. Performance measurement of machine learning

In this research, two feature sets are used: the voltage feature set (made up of voltage features)
and the current feature set (made up of current features). These features are first estimated from
the voltage and current signals. Min-max normalization is then applied to scale the features to
the range between 0 and 1, which prevents numerical issues. M-fold cross-validation is used for
performance evaluation. In M-fold cross-validation, the dataset is divided equally into M parts, and
each part is used in turn as the testing set while the other M-1 parts are used as the training set [54].
This study sets M=10, and KNN and NB are each executed 10 times. For performance measurement,
five evaluation metrics, namely accuracy, precision, sensitivity, specificity and F-measure, are calculated.
They are defined as follows [55]:


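$$\text{Accuracy} = \frac{\text{No. of correctly diagnosed samples}}{\text{Total number of samples}} \tag{1}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{3}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{4}$$

$$\text{F-measure} = \frac{2TP}{2TP + FN + FP} \tag{5}$$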


where TP, TN, FP and FN denote the true positives, true negatives, false positives and false negatives,
respectively, all of which can be obtained from the confusion matrix.
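
The evaluation procedure of this subsection can be sketched as below, covering min-max normalization, the split into M folds, and the metrics (1)-(5) computed from a confusion matrix. This is a hedged illustration only. The text does not state how the per-class counts are combined in the multi-class case, so the sketch assumes one-vs-rest counting with an unweighted (macro) average over classes; the function names, the shuffling seed and the toy labels are likewise assumptions.

```python
import numpy as np

def minmax_normalize(X):
    """Scale every feature column to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # constant columns map to 0
    return (X - lo) / span

def kfold_indices(n_samples, M=10, seed=0):
    """Shuffle the sample indices and split them into M near-equal test folds."""
    rng = np.random.default_rng(seed)        # the seed is an arbitrary assumption
    return np.array_split(rng.permutation(n_samples), M)

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def _ratio(num, den):
    """Element-wise num / den, returning 0 where the denominator is 0."""
    return np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den > 0)

def macro_metrics(cm):
    """Metrics (1)-(5) with one-vs-rest counts averaged over classes (assumed)."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - tp - fp - fn
    return {
        "accuracy": tp.sum() / cm.sum(),
        "precision": _ratio(tp, tp + fp).mean(),
        "sensitivity": _ratio(tp, tp + fn).mean(),
        "specificity": _ratio(tn, tn + fp).mean(),
        "f_measure": _ratio(2 * tp, 2 * tp + fn + fp).mean(),
    }

# Hypothetical labels for three classes, used only to exercise the functions.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2]
scores = macro_metrics(confusion_matrix(y_true, y_pred, n_classes=3))
folds = kfold_indices(25, M=10)
```

Under one-vs-rest counting, the true negatives of each class include every correctly handled sample of all other classes, so specificity can remain high even when accuracy is low.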


3. RESULTS AND DISCUSSION 

Table 1 shows the accuracy, precision, sensitivity, specificity and F-measure obtained when
identifying the harmonic sources using KNN and NB with the voltage feature set. The same results
are plotted in Figure 4. As can be seen, the performance with the voltage feature set was very low.
NB showed the better results, with accuracy, precision, sensitivity, specificity and F-measure of
0.4000, 0.3039, 0.3129, 0.9016, and 0.3037, respectively. However, apart from specificity, every metric
for both KNN and NB was below 50%, which means the voltage features cannot identify the harmonic
sources correctly.


Table 1. The performances of KNN and NB using voltage feature set

Evaluation metrics    KNN       NB
Accuracy              0.2600    0.4000
Precision             0.1485    0.3039
Sensitivity           0.1521    0.3129
Specificity           0.8789    0.9016
F-measure             0.1498    0.3072

Figure 4. Performance evaluation of voltage feature set


Table 2 presents the accuracy, precision, sensitivity, specificity and F-measure obtained when
identifying the harmonic sources using KNN and NB with the current feature set, and Figure 5 shows
the same results as a bar chart. As can be seen, the results achieved with the current feature set were
significantly better than those with the voltage feature set. This is because the sources in the system
are connected in parallel, so the voltage differs only very slightly between cases. In contrast,
the current signals differ in most cases because of the parallel connection.
Hence, good performance was achieved when the current features were used.

Based on the results obtained, KNN and NB were both able to identify the multiple harmonic sources
in this work, and both achieved an accuracy of 96%. However, compared with NB, KNN scored higher
precision (0.9600), sensitivity (0.9587), specificity (0.9941) and F-measure (0.9590). Hence, it can be
concluded that KNN performed better than NB in harmonic source identification.


Table 2. The performances of KNN and NB using current feature set

Evaluation metrics    KNN       NB
Accuracy              0.9600    0.9600
Precision             0.9600    0.9576
Sensitivity           0.9586    0.9557
Specificity           0.9941    0.9934
F-measure             0.9590    0.9561

Figure 5. Performance evaluation of current feature set


Figures 6 and 7 show the confusion matrices of KNN and NB for the identification of harmonic
sources using the current feature set. It is worth noting that the confusion matrix for the voltage
feature set is not presented in this paper because of its poor performance in harmonic source
identification. The figures show that KNN and NB were able to identify the harmonic sources very well.
With KNN in particular, the Hc-N, N-N and N-Hv classes were perfectly identified (100% class-wise
accuracy). With NB, only two classes (Hc-Hc and N-Hv) were perfectly recognized. These results suggest
that KNN is an excellent classifier that offers high class-wise performance in harmonic source identification.





Figure 6. Confusion matrix of KNN using current feature set

Figure 7. Confusion matrix of NB using current feature set

(Classes on both axes of Figures 6 and 7: Hc-Hc, Hc-N, N-Hc, N-N, Hv-Hv, Hv-N, N-Hv.)

4. CONCLUSION

In this paper, a power quality diagnostic system for identifying multiple harmonic sources was
proposed. First, the power quality signals were generated and collected. Next, the voltage and
current features were estimated and formed into a voltage feature set and a current feature set.
The proposed diagnostic system used the machine learning algorithms KNN and NB. Based on
the experimental results, the combination of current features and KNN achieved the highest
performance in terms of accuracy, precision, sensitivity, specificity and F-measure in this work.
In future, other popular classifiers such as the convolutional neural network can be applied to
harmonic source identification.